
    BRASERO: A Resource for Benchmarking RNA Secondary Structure Comparison Algorithms

    The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets.

    Pareto optimization in algebraic dynamic programming

    Saule C, Giegerich R. Pareto optimization in algebraic dynamic programming. Algorithms for Molecular Biology. 2015;10(1): 22.
    Pareto optimization combines independent objectives by computing the Pareto front of its search space, defined as the set of all solutions for which no other candidate solution scores better under all objectives. This gives, in a precise sense, better information than an artificial amalgamation of different scores into a single objective, but is more costly to compute. Pareto optimization naturally occurs with genetic algorithms, albeit in a heuristic fashion. Non-heuristic Pareto optimization so far has been used only with a few applications in bioinformatics. We study exact Pareto optimization for two objectives in a dynamic programming framework. We define a binary Pareto product operator ∗Par on arbitrary scoring schemes. Independent of a particular algorithm, we prove that for two scoring schemes A and B used in dynamic programming, the scoring scheme A∗ParB correctly performs Pareto optimization over the same search space. We study different implementations of the Pareto operator with respect to their asymptotic and empirical efficiency. Without artificial amalgamation of objectives, and with no heuristics involved, Pareto optimization is faster than computing the same number of answers separately for each objective. For RNA structure prediction under the minimum free energy versus the maximum expected accuracy model, we show that the empirical size of the Pareto front remains within reasonable bounds. Pareto optimization lends itself to the comparative investigation of the behavior of two alternative scoring schemes for the same purpose. For the above scoring schemes, we observe that the Pareto front can be seen as a composition of a few macrostates, each consisting of several microstates that differ in the same limited way. We also study the relationship between abstract shape analysis and the Pareto front, and find that they extract information of a different nature from the folding space and can be meaningfully combined.
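
    The notion of a Pareto front for two objectives, as described in the abstract above, can be illustrated with a minimal sketch. It is not the implementation from the paper: the function name `pareto_front`, the choice to maximize both scores, and the common dominance convention (at least as good on both objectives, strictly better on one) are assumptions made for illustration.

```python
def pareto_front(points):
    """Return the Pareto front of (a, b) score pairs, maximizing both scores.

    Assumption: a point is dominated if another point is at least as good
    on both objectives and strictly better on at least one.
    """
    # Sort by the first score descending, ties broken by the second descending.
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front = []
    best_b = float("-inf")
    for a, b in ordered:
        # Every point seen so far is at least as good on the first score,
        # so this point survives only if it is strictly better on the second.
        if b > best_b:
            front.append((a, b))
            best_b = b
    return front


candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
front = pareto_front(candidates)
print(front)
```

    Sorting first makes the filter a single linear pass, so the whole computation costs O(n log n) for n candidates; this is one of several possible implementation choices.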

    Integrating Pareto Optimization into Dynamic Programming

    Gatter T, Giegerich R, Saule C. Integrating Pareto Optimization into Dynamic Programming. Algorithms. 2016;9(1): 12.
    Pareto optimization combines independent objectives by computing the Pareto front of the search space, yielding a set of optima where none scores better on all objectives than any other. Recently, it was shown that Pareto optimization seamlessly integrates with algebraic dynamic programming: when scoring schemes A and B can correctly evaluate the search space via dynamic programming, then so can Pareto optimization with respect to A and B. However, the integration of Pareto optimization into dynamic programming opens a wide range of algorithmic alternatives, which we study in substantial detail in this article, using real-world applications in biosequence analysis, a field where dynamic programming is ubiquitous. Our results are two-fold: (1) We introduce the operation of a Pareto algebra product in the dynamic programming framework of Bellman's GAP. Users of this framework can now ask for Pareto optimization with a single keystroke. Careful evaluation of the implementation alternatives by means of an extended Bellman's GAP compiler demonstrates the dependence of the best implementation choice on the application at hand. (2) We extract from our experiments several pieces of advice to programmers who do not use a system such as Bellman's GAP, but who choose to hand-craft their dynamic programming recurrences, incorporating Pareto optimization from scratch.
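
    For readers who hand-craft their recurrences, the idea of keeping a Pareto front in every dynamic programming cell can be sketched on a toy problem. The example below is hypothetical and unrelated to Bellman's GAP or to biosequence analysis: it assumes a grid whose cells carry two additive scores, paths that move only right or down, and both scores being maximized. The names `pareto_front` and `pareto_grid_paths` are invented for this sketch.

```python
def pareto_front(points):
    """Pareto front of (a, b) pairs, maximizing both (sort-and-scan filter)."""
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    front, best_b = [], float("-inf")
    for a, b in ordered:
        if b > best_b:  # not dominated by any point with a larger or equal a
            front.append((a, b))
            best_b = b
    return front


def pareto_grid_paths(grid):
    """Pareto-optimal score pairs over all right/down paths through the grid.

    grid[i][j] is an (a, b) pair added to the running totals of any path
    that visits cell (i, j). Each table cell stores a Pareto front, not a
    single optimum.
    """
    rows, cols = len(grid), len(grid[0])
    table = [[[] for _ in range(cols)] for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            a, b = grid[i][j]
            if i == 0 and j == 0:
                table[i][j] = [(a, b)]
                continue
            # Collect the fronts of the predecessor cells (above and left).
            candidates = []
            if i > 0:
                candidates += table[i - 1][j]
            if j > 0:
                candidates += table[i][j - 1]
            # Extend each candidate by this cell's scores, then prune
            # dominated totals before storing the result.
            table[i][j] = pareto_front([(x + a, y + b) for x, y in candidates])
    return table[-1][-1]


grid = [
    [(0, 0), (3, 0)],
    [(0, 3), (0, 0)],
]
result = pareto_grid_paths(grid)
print(result)
```

    Pruning inside every cell is safe here because adding the same pair to two totals preserves dominance between them; a dominated partial path can never become part of a non-dominated full path.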

    Benchmarking RNA secondary structure comparison algorithms

    In the last ten years, several tools have been proposed for RNA secondary structure pairwise comparison. These tools use different models (ordered tree or forest, arc annotated sequence, multi-level tree) and methods (edit distance, alignment). We present a first benchmark for comparing these tools. For various RNA families, we built two sets of secondary structures. The first, called the reference set, is composed of a small number of RNAs with their known structures. The second is composed of sequences folded using Mfold and RNAshapes. Some of these sequences correspond to structural RNAs of the same families (true events), others correspond to noise. We studied the ability of each tool to find the true events using the reference set. In particular we analysed the results in terms of sensitivity/specificity, distribution and spread of the scores, and computation time.

    BRASERO: A resource for benchmarking RNA secondary structure comparison algorithms

    The pairwise comparison of RNA secondary structures is a fundamental problem, with direct application in mining databases for annotating putative noncoding RNA candidates in newly sequenced genomes. An increasing number of software tools are available for comparing RNA secondary structures, based on different models (such as ordered trees or forests, arc annotated sequences, and multilevel trees) and computational principles (edit distance, alignment). We describe here the website BRASERO that offers tools for evaluating such software tools on real and synthetic datasets.